16 research outputs found

    Time Evolution and Predictability of Social Behavior in Techno-Social Networks

    The increasing availability of social data from socio-technological systems (systems that record our daily activity, such as credit card records, phone call records, email, etc.) and from online social networks (such as Facebook, Twitter, Instagram, etc.) has made it possible to study human behavior from different perspectives. Uncovering the patterns behind these data would not only give us better knowledge of our society but could also benefit it in a number of ways, such as adapting technology to social needs or designing better policies to prevent the spread of epidemics. The aim of this thesis is precisely to uncover both structural and temporal patterns in social systems and to develop predictive models based on them. In particular, we analyze the long-term evolution of an email network with over 1,000 individuals throughout four consecutive years. We find that, although the evolution of individual ties is highly unpredictable, the macro-evolution of social communication networks follows well-defined statistical laws, characterized by exponentially decaying log-variations of the weight of social ties and of individuals' social strength. At the same time, we find that individuals have social signatures that are remarkably stable over the scale of several years. Regarding predictability, we develop two network-based models: a recommender model (which predicts users' ratings of items) and a temporal inference model (which predicts events in time). Our recommender model makes scalable predictions and is considerably more accurate than current algorithms on datasets with millions of ratings. The approach is based on the assumption that there are groups of individuals and of items (e.g. movies, books, etc.), and that the preferences of an individual for a given item depend on their group memberships. Importantly, we allow each individual and each item to belong simultaneously to different groups. The resulting overlapping communities and the predicted preferences can be inferred with a scalable expectation-maximization algorithm based on a variational approximation. In the temporal inference model, users can belong simultaneously to different groups, and the time intervals also belong to overlapping communities. The results suggest that the algorithm is able to distinguish real events from non-events almost perfectly.

    Treatment with tocilizumab or corticosteroids for COVID-19 patients with hyperinflammatory state: a multicentre cohort study (SAM-COVID-19)

    Objectives: The objective of this study was to estimate the association between tocilizumab or corticosteroids and the risk of intubation or death in patients with coronavirus disease 19 (COVID-19) with a hyperinflammatory state according to clinical and laboratory parameters. Methods: A cohort study was performed in 60 Spanish hospitals including 778 patients with COVID-19 and clinical and laboratory data indicative of a hyperinflammatory state. Treatment was mainly with tocilizumab, an intermediate-high dose of corticosteroids (IHDC), a pulse dose of corticosteroids (PDC), combination therapy, or no treatment. Primary outcome was intubation or death; follow-up was 21 days. Propensity score-adjusted estimations using Cox regression (logistic regression if needed) were calculated. Propensity scores were used as confounders, matching variables and for the inverse probability of treatment weights (IPTWs). Results: In all, 88, 117, 78 and 151 patients treated with tocilizumab, IHDC, PDC, and combination therapy, respectively, were compared with 344 untreated patients. The primary endpoint occurred in 10 (11.4%), 27 (23.1%), 12 (15.4%), 40 (25.6%) and 69 (21.1%), respectively. The IPTW-based hazard ratios (odds ratio for combination therapy) for the primary endpoint were 0.32 (95% CI 0.22-0.47; p < 0.001) for tocilizumab, 0.82 (0.71-1.30; p 0.82) for IHDC, 0.61 (0.43-0.86; p 0.006) for PDC, and 1.17 (0.86-1.58; p 0.30) for combination therapy. Other applications of the propensity score provided similar results, but were not significant for PDC. Tocilizumab was also associated with lower hazard of death alone in IPTW analysis (0.07; 0.02-0.17; p < 0.001). Conclusions: Tocilizumab might be useful in COVID-19 patients with a hyperinflammatory state and should be prioritized for randomized trials in this situation.
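
The inverse probability of treatment weights mentioned in the Methods can be illustrated with a minimal Python sketch. It assumes the textbook unstabilized form (weight 1/p for treated patients and 1/(1 - p) for untreated ones, where p is the propensity score); the abstract does not say which variant the study used. The function name and the sample values are hypothetical.

```python
def iptw_weights(treated, propensity):
    """Unstabilized inverse probability of treatment weights.

    treated:    iterable of 0/1 flags (1 = received the treatment)
    propensity: iterable of estimated P(treatment | covariates)
    Assumption: textbook weights, not necessarily those used in the study.
    """
    weights = []
    for z, p in zip(treated, propensity):
        if not 0.0 < p < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        # Treated patients are weighted by 1/p, untreated ones by 1/(1 - p).
        weights.append(1.0 / p if z else 1.0 / (1.0 - p))
    return weights


# Hypothetical toy data: two treated and two untreated patients.
treated = [1, 1, 0, 0]
propensity = [0.8, 0.25, 0.5, 0.2]
w = iptw_weights(treated, propensity)
```

Patients who received a treatment they were unlikely to get (low p) receive large weights, which is how the weighted sample approximates one in which treatment is independent of the measured confounders.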

    Long-Term Evolution of Email Networks: Statistical Regularities, Predictability and Stability of Social Behaviors.

    In social networks, individuals constantly drop ties and replace them with new ones in a highly unpredictable fashion. This highly dynamical nature of social ties has important implications for processes such as the spread of information or of epidemics. Several studies have demonstrated the influence of a number of factors on the intricate microscopic process of tie replacement, but the macroscopic long-term effects of such changes remain largely unexplored. Here we investigate whether, despite the inherent randomness at the microscopic level, there are macroscopic statistical regularities in the long-term evolution of social networks. In particular, we analyze the email network of a large organization with over 1,000 individuals throughout four consecutive years. We find that, although the evolution of individual ties is highly unpredictable, the macro-evolution of social communication networks follows well-defined statistical patterns, characterized by exponentially decaying log-variations of the weight of social ties and of individuals' social strength. At the same time, we find that individuals have social signatures and communication strategies that are remarkably stable over the scale of several years.

    Predictability of logarithmic growth rates for connection weight $r_\omega(t+1)$ (A, C, E) and user strength $r_s(t+1)$ (B, D, F).

    (A) Joint probability density of $r_\omega(t+1)$, the logarithmic growth rate of weights at time $t+1$, and $r_\omega(t)$, the logarithmic growth rate of weights at time $t$. (B) Joint probability density of $r_s(t+1)$, the logarithmic growth rate of strengths at time $t+1$, and $r_s(t)$, the logarithmic growth rate of strengths at time $t$. (C) Joint probability density of $r_\omega(t+1)$, the logarithmic growth rate of weights at time $t+1$, and $\omega(t)$, the weight at time $t$. The area shaded in grey is not allowed since $r_\omega(t+1) \geq -\log \omega(t)$. (D) Joint probability density of $r_s(t+1)$, the logarithmic growth rate of strengths at time $t+1$, and $s(t)$, the strength at time $t$. The area shaded in grey is forbidden since $r_s(t+1) \geq -\log s(t)$. In plots (A-D), circles and error bars show the mean and one standard error of the mean for values binned along the X axis. It is visually apparent that $\omega(t)$ and $s(t)$ are more informative about $r_\omega(t+1)$ and $r_s(t+1)$, respectively, than $r_\omega(t)$ and $r_s(t)$ (as confirmed by Spearman's $\rho$ and p-values, displayed inside each graph). (E, F) Root mean squared error (RMSE) of the predictions of the logarithmic growth rates at time $t+1$ obtained from leave-one-out experiments. As predictors, we use: (E) $\omega(t)$, $r_\omega(t)$, and $\mu_\omega(t)$ (see Eq (5), http://www.plosone.org/article/info:doi/10.1371/journal.pone.0146113#pone.0146113.e014); (F) $s(t)$, $r_s(t)$, and $\mu_s(t)$ (see Eq (3), http://www.plosone.org/article/info:doi/10.1371/journal.pone.0146113#pone.0146113.e005). Additionally, in both cases we try to predict the logarithmic growth rate using a Random Forest regressor [29] (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0146113#pone.0146113.ref029). Note that a simple approach (i.e. considering the weight/strength at time $t$) performs significantly better than a well-performing machine learning algorithm such as the Random Forest. In any case, and despite being the most predictive, the weight/strength at time $t$ provides only moderate improvements over predictions made using the mean value $\mu_\omega$ for all connections and $\mu_s$ for all users.

    Long term email communication data within an organization

    Undirected email correspondence between users of a large organization with over 1,000 individuals for four consecutive years (2007-2010). For this period, we have information on the sender, the receiver and the total number of emails sent within the organization using the corporate email address. To preserve users' privacy, individuals are completely anonymized and we do not have access to email content (see Ethics statement).
    The data is in the following format:
    user1ID user2ID #emails
    where #emails is the total number of emails exchanged (sent and received) in one calendar year. The files are separated by year.
    Ethics statement:
    This data is exempt from IRB review because: i) The research involves the study of existing data (email logs from 2007 to 2010), which the IT service of the organization archived routinely, as mandated by law; ii) The information is recorded by the investigators in such a manner that subjects cannot be identified, directly or through identifiers linked to the subjects. Indeed, subjects were assigned a "hash" by the IT service prior to the start of our research, so that none of the investigators can link the "hash" back to the subject. We have no demographic information of any kind, so de-anonymization is also impossible.
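
The file format described above (one `user1ID user2ID #emails` record per pair of users, one file per year) can be read with a few lines of Python. The following is a minimal sketch, not the authors' code: the function names and the sample records are hypothetical, and it assumes whitespace-separated fields with an integer count.

```python
from collections import defaultdict


def parse_edges(text):
    """Parse 'user1ID user2ID #emails' records into {(u, v): weight}.

    Pairs are stored with the two IDs in sorted order because the
    correspondence is described as undirected.
    """
    weights = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        u, v, n = parts[0], parts[1], int(parts[2])
        key = (u, v) if u <= v else (v, u)
        weights[key] = weights.get(key, 0) + n
    return weights


def strengths(weights):
    """Strength of a user: total emails exchanged with all other users."""
    s = defaultdict(int)
    for (u, v), w in weights.items():
        s[u] += w
        s[v] += w
    return dict(s)


# Hypothetical content of one yearly file (IDs stand in for anonymized hashes).
sample_year = """a1 b2 10
b2 c3 4
a1 c3 1
"""
edge_weights = parse_edges(sample_year)
user_strengths = strengths(edge_weights)
```

Running the same two functions on each yearly file gives the per-year weights and strengths from which growth rates between years can be computed.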

    Time evolution of nodes’ strengths.

    The strength $s_i$ of node $i$ is the number of emails that user $i$ exchanged with other users during one year. (A) Distributions of strengths for each year in our dataset (2007-2010). Note that the distribution is stable in time. (B) Distribution of centered strength logarithmic growth rates $r_s^0 = \log(s(t+\Delta t)) - \log(s(t)) - \mu(t, \Delta t)$ for $\Delta t = 1, 2, 3$ years (dots, squares and diamonds, respectively). Lines show fits to a Laplace distribution (parameters: $\Delta t = 1$: $\sigma_{\mathrm{exp}} = 0.57$; $\Delta t = 2$: $\sigma_{\mathrm{exp}} = 0.74$; $\Delta t = 3$: $\sigma_{\mathrm{exp}} = 0.83$). Note that the distributions become wider as $\Delta t$ increases (see Fig D in S2 File, http://www.plosone.org/article/info:doi/10.1371/journal.pone.0146113#pone.0146113.s002). For the specific values of the distribution modes $\mu(t, \Delta t)$, see Fig B in S2 File (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0146113#pone.0146113.s002).
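
The centered growth rate in panel (B) can be computed directly from two yearly strength tables. The sketch below makes two assumptions that the caption does not confirm: the mode $\mu(t, \Delta t)$ is estimated by the sample median (which coincides with the mode for a Laplace distribution), and the Laplace scale is estimated by maximum likelihood as the mean absolute deviation. Whether that scale is the same quantity as the reported $\sigma_{\mathrm{exp}}$ is not stated here. Function names and sample values are hypothetical.

```python
import math
from statistics import median


def centered_log_growth(s_t, s_later):
    """Centered log growth rates for users with positive strength in both years.

    Returns ({user: r0}, mu), where mu is the sample median of the raw
    rates log(s(t + dt)) - log(s(t)).
    Assumption: the median stands in for the distribution mode.
    """
    raw = {
        u: math.log(s_later[u]) - math.log(s_t[u])
        for u in s_t
        if u in s_later and s_t[u] > 0 and s_later[u] > 0
    }
    mu = median(raw.values())
    return {u: r - mu for u, r in raw.items()}, mu


def laplace_scale(centered):
    """Maximum-likelihood Laplace scale: mean absolute value of centered rates."""
    vals = list(centered.values())
    return sum(abs(v) for v in vals) / len(vals)


# Hypothetical strengths for three users in two consecutive years.
s_2007 = {"a1": 100, "b2": 50, "c3": 20}
s_2008 = {"a1": 200, "b2": 50, "c3": 10}
r0, mu = centered_log_growth(s_2007, s_2008)
scale = laplace_scale(r0)
```

Repeating the calculation for year pairs that are two and three years apart would show whether the estimated scale grows with $\Delta t$, as the caption reports for the fitted distributions.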